Purposeful Selection of Variables in Logistic Regression: Macro and Simulation Results

Authors

  • Zoran Bursac
  • C. Heath Gauss
  • D. Keith Williams
  • David Hosmer
Abstract

The main problem in any model-building situation is to choose, from a large set of covariates, those that should be included in the “best” model. A decision to keep a variable in the model might be based on clinical or statistical significance. There are several variable selection algorithms embedded in SAS PROC LOGISTIC. Those methods are mechanical and as such carry some limitations. Hosmer and Lemeshow [2000] describe a purposeful selection of covariates algorithm within which an analyst makes a variable selection decision at each step of the modeling process. In this paper we introduce a macro, %PurposefulSelection, which automates this process. We conduct a simulation study to compare the performance of this algorithm with three well-documented variable selection procedures in SAS PROC LOGISTIC: FORWARD, BACKWARD, and STEPWISE. Results and implications are discussed.

Similar resources

A Purposeful Selection of Variables Macro for Logistic Regression

The main problem in any model-building situation is to choose, from a large set of covariates, those that should be included in the “best” model. A decision to keep a variable in the model might be based on clinical or statistical significance. There are several variable selection algorithms embedded in SAS PROC LOGISTIC. Those methods are mechanical and as such carry some limitations. Hosmer...

Augmented Backward Elimination: A Pragmatic and Purposeful Way to Develop Statistical Models

Statistical models are simple mathematical rules derived from empirical data describing the association between an outcome and several explanatory variables. In a typical modeling situation statistical analysis often involves a large number of potential explanatory variables and frequently only partial subject-matter knowledge is available. Therefore, selecting the most suitable variables for a...

A SAS Macro for Hosmer and Lemeshow’s Purposeful Selection Model Building Algorithm: Description and Performance

A common problem in many model-building situations is to choose, from a large set of covariates, those that should be included in the “best” model. An additional consideration in modeling epidemiological data is the inclusion of confounders, which adds a quirk in the modeling procedure in that statistical significance is not the main criterion for keeping predictors in a model. Hosmer and Lemeshow (2000...

Penalized Bregman Divergence Estimation via Coordinate Descent

Variable selection via penalized estimation is appealing for dimension reduction. For penalized linear regression, Efron, et al. (2004) introduced the LARS algorithm. Recently, the coordinate descent (CD) algorithm was developed by Friedman, et al. (2007) for penalized linear regression and penalized logistic regression and was shown to gain computational superiority. This paper explores...

Effects of Multicollinearity in All Possible Mixed Model Selection

The effects of multicollinearity in all possible model selection of fixed effects including quadratic and cross products in the presence of random and repeated measures effects are presented here. The user-friendly SAS macro application ALLMIXED2 complements the model selection option currently available in the SAS macro applications ‘REGDIAG’ and ‘LOGISTIC’ for multiple linear and logistic reg...

Publication date: 2007